The evolution of genomes and language.

Author

  • Hong-Yu Zhang
Abstract

Since Charles Darwin published The Origin of Species in 1859, his theory of evolution has been a central axiom for the biological sciences. Scientists have devoted considerable effort to unravelling the evolutionary mechanisms of biological, and other, systems. Their work, in particular the discovery of DNA as the carrier of hereditary information, has led to the now nearly universally accepted idea that evolution causes the increasing complexity of most biological systems. However, although the basic mechanisms of evolution (mutation and selection) are clear, accumulating data suggest that the means to increase complexity have also become more complex. This raises the question of how such mechanisms have evolved over time.

Thanks to rapid progress in the biological sciences, in particular in the field of genomics, and in the information sciences, including linguistics, we have amassed an enormous amount of data on genomes and biological evolutionary mechanisms. These data provide an unprecedented opportunity to investigate how the evolutionary strategies of evolving systems, such as genomes and languages, may have changed. In fact, the information contained in genomes has long been compared with languages, and many linguistic methodologies are now used to analyse genomes (Searls, 2002). If we compare the evolution of genomes and language, we find that both systems have undergone similar strategic shifts to attain increasing complexity, which suggests that this is an intrinsic property of many evolving systems, not just biological ones.

During the primary stage in the evolution of both genomes and languages, increasing complexity was achieved mainly by increasing the number of basic information-carrying elements: nouns and verbs in language, and genes in biology. The Chinese language is a good example to use in this context, because of its history of more than 6,000 years and its continuing evolution.
The early Chinese written language, called Oracle, consisted of a few thousand characters. This number gradually increased to more than 47,000 individual characters, as the ancient Chinese continually invented characters to refine their ability to describe their environment (Table 1; Ji, 1989). For example, more than 70 characters described horses on the basis of their colour, age and gender, and more than 80 characters described their behaviour (Chen, 1936). According to a statistical analysis of 800 million ancient characters, one would need to know about 22,000 characters to attain 99.99% coverage of ancient Chinese (Zhang, 2004).

The same phenomenon can be observed in biological evolution. Simple and early organisms, such as prokaryotes, manage to survive and proliferate with a few thousand genes, whereas the number of genes in higher organisms is about an order of magnitude higher; some plants, such as maize, have more than 50,000 genes (Messing et al, 2004).

In the second stage of their evolution, both systems began to rearrange existing elements into new combinations to increase complexity further. Interestingly, this leads to a decrease in the number of elements. For example, modern Chinese uses considerably fewer characters than did the ancient language: only 4,600 characters are now needed to attain 99.99% coverage (Zhang, 1997). However, modern Chinese is definitely more powerful than its ancient predecessor, because it is able to describe a much more complex world. The 3,500 most frequently used characters, which cover about 99.87% of modern Chinese, can be combined to form more than 70,000 words (Zhang, 1997), which include the meanings of most ancient characters. Genomes have undergone a similar evolutionary shift.
Mammals, such as Homo sapiens and the mouse, have about 25,000 genes (International Human Genome Sequencing Consortium, 2004; Guénet, 2005), far fewer than some plants: for example, rice has 43,000 genes (Paterson et al, 2005) and maize has 59,000 (Messing et al, 2004). The greater complexity of mammals is explained partly by the recombination of existing genes, through mechanisms such as alternative splicing (Johnson et al, 2003) and tandem chimerism (Parra et al, 2006). In fact, mammals depend much more on gene recombination to achieve higher complexity than do plants (Messing, 2001).

The third stage of evolution witnesses the arrival of 'virtual', or modifying, elements, which have an important role as regulatory components. In language, adverbs, auxiliary words, prepositions and conjunctions are all virtual elements, whereas nouns and verbs are the main carriers of information. In fact, the five most frequently used characters in modern Chinese include two 'empty' elements (Fig 1). Although virtual words were quite rare in Oracle, they are much more frequent in modern Chinese. Similarly, noncoding RNA has a virtual role in genomes, whereas protein-coding genes are the main carriers of information. In mammals, it is RNA, not protein, that mainly controls gene activity (Mattick, 2004), which also helps to explain the unexpectedly low number of protein-coding genes in mammalian genomes.

Table 1 | Evolution of Chinese characters (a complete list of sources is given in Ji (1989))
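
The coverage figures quoted in the abstract (for example, 4,600 characters for 99.99% coverage of modern Chinese) are cumulative-frequency statistics. The following is a minimal sketch of how such a figure could be computed, assuming a plain text string as the corpus; the function name `chars_needed_for_coverage` and the toy input are hypothetical and are not taken from Zhang (1997, 2004).

```python
from collections import Counter


def chars_needed_for_coverage(text, target):
    """Count how many of the most frequent distinct characters are needed
    so that their occurrences make up at least `target` (a fraction
    between 0 and 1) of all non-whitespace characters in `text`."""
    # Tally every non-whitespace character in the corpus.
    counts = Counter(ch for ch in text if not ch.isspace())
    total = sum(counts.values())
    if total == 0:
        return 0

    covered = 0
    # Walk through characters from most to least frequent, accumulating
    # their share of the corpus until the target coverage is reached.
    for needed, (_, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered / total >= target:
            return needed
    return len(counts)


if __name__ == "__main__":
    # Toy corpus (an assumption for illustration only).
    sample = "aaaabbbccd"
    print(chars_needed_for_coverage(sample, 0.5))
    print(chars_needed_for_coverage(sample, 0.95))
```

On a real corpus the same loop would be run over character counts from hundreds of millions of characters; the sketch only illustrates the arithmetic behind a statement such as "N characters attain X% coverage".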

Journal:
  • EMBO reports

Volume 7, Issue 8

Pages -

Publication date: 2006